06. Denormalization in Apache Cassandra
05 Denomorlization In Apache -
Note of correction:
At 2:55 of the video, the instructor says "Losing customers to outages or low latency is not [inexpensive]."
She should have said "Losing customers to outages or poor performance is not [inexpensive]."
Data Modeling in Apache Cassandra:
- Denormalization is not just okay -- it's a must
- Denormalization must be done for fast reads
- Apache Cassandra has been optimized for fast writes
- ALWAYS think Queries first
- One table per query is a great strategy
- Apache Cassandra does not allow for JOINs between tables
Commonly Asked Questions:
- I see certain downsides of this approach, since in a production application, requirements change quickly and I may need to improve my queries later. Isn't that a downside of Apache Cassandra?
In Apache Cassandra, you want to model your data to your queries, and if your business need calls for quickly changing requirements, you need to create a new table to process the data. That is a requirement of Apache Cassandra. If your business needs calls for ad-hoc queries, these are not a strength of Apache Cassandra. However keep in mind that it is easy to create a new table that will fit your new query.
Additional Resource:
Here is a reference to the DataStax documents on [Apache Cassandra].(https://docs.datastax.com/en/dse/6.7/cql/cql/ddl/dataModelingApproach.html)